In mathematical optimization theory, the simplex algorithm, created by the American mathematician George Dantzig in 1947, is a popular algorithm for numerically solving linear programming problems. The journal Computing in Science and Engineering listed it as one of the top 10 algorithms of the twentieth century.[1]
The method uses the concept of a simplex, which is a polytope of N + 1 vertices in N dimensions: a line segment in one dimension, a triangle in two dimensions, a tetrahedron in three-dimensional space and so forth.
Consider a linear programming problem,

\[ \text{maximize } c^T x \quad \text{subject to} \quad Ax \le b, \quad x \ge 0, \]

with \(x\) the variables of the problem, \(c\) a vector representing the linear form to optimize, \(A\) a rectangular \(p \times n\) matrix and \(b\) the linear constraints.
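Such a problem can be handed to any off-the-shelf LP solver. As a minimal sketch (assuming SciPy is available; its linprog routine minimizes, so the objective is negated in order to maximize, and the data below are illustrative only):

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([2.0, 3.0, 4.0])          # linear form to optimize
A = np.array([[3.0, 2.0, 1.0],
              [2.0, 5.0, 3.0]])        # rectangular p-by-n matrix
b = np.array([10.0, 15.0])             # right-hand sides of the constraints

# linprog minimizes, so pass -c; bounds default to x >= 0.
res = linprog(-c, A_ub=A, b_ub=b, method="highs")
print(res.x, -res.fun)                 # optimal vertex and maximum value
```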
In geometric terms, each inequality specifies a half-space in n-dimensional Euclidean space, and their intersection is the set of all feasible values the variables can take. The region is convex and either empty, unbounded, or a polytope.
The set of points where the objective function obtains a given value \(v\) is the hyperplane \(c^T x = v\). We are looking for the largest \(v\) such that this hyperplane still intersects the feasible region. As \(v\) increases, the hyperplane translates in the direction of the vector \(c\). Intuitively, and indeed provably by convexity, the last hyperplane to intersect the feasible region will either just graze a vertex of the polytope, or contain a whole edge or face. In the latter two cases, the vertices of that edge or face still achieve the optimum value. Thus, the optimum value is always achieved at (at least) one of the vertices of the polytope.
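To make the vertex property concrete, one can brute-force a tiny two-dimensional instance: intersect the constraint boundaries pairwise, keep the feasible intersection points (the candidate vertices), and evaluate the objective at each. The data below are made up for illustration:

```python
import itertools
import numpy as np

c = np.array([3.0, 2.0])                      # objective direction
A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
b = np.array([4.0, 3.0, 3.0])                 # x1 + x2 <= 4, x1 <= 3, x2 <= 3

# Stack the non-negativity constraints -x <= 0 with A x <= b.
G = np.vstack([A, -np.eye(2)])
h = np.concatenate([b, np.zeros(2)])

vertices = []
for i, j in itertools.combinations(range(len(h)), 2):
    M = G[[i, j]]
    if abs(np.linalg.det(M)) < 1e-12:
        continue                              # parallel boundaries: no vertex
    v = np.linalg.solve(M, h[[i, j]])
    if np.all(G @ v <= h + 1e-9):             # keep feasible intersections only
        vertices.append(v)

best = max(vertices, key=lambda v: c @ v)
print(best, c @ best)                         # the maximum sits at a vertex
```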
The simplex algorithm applies this insight by walking along edges of the (possibly unbounded) polytope to vertices with higher objective function value. When a local maximum is reached, by convexity it is also the global maximum and the algorithm terminates. It also terminates when it encounters an unbounded edge along which the objective keeps increasing, concluding that the problem has no finite optimum. The algorithm always terminates because the number of vertices of the polytope is finite; moreover, since we always jump to vertices with strictly larger objective value, no vertex is visited twice, and one hopes that the number of vertices visited will be small. Usually more than one adjacent vertex improves the objective function, so a pivot rule must be specified to determine which vertex to pick. The choice of this rule has a great impact on the runtime of the algorithm.
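Two classical pivot rules are sketched below, under the convention that a negative reduced cost marks an improving variable: Dantzig's original rule takes the most negative reduced cost, while Bland's rule takes the smallest improving index (slower in practice, but it provably avoids cycling):

```python
import numpy as np

def entering_dantzig(reduced_costs):
    """Dantzig's rule: the most negative reduced cost; None means optimal."""
    j = int(np.argmin(reduced_costs))
    return j if reduced_costs[j] < 0 else None

def entering_bland(reduced_costs):
    """Bland's rule: the lowest-indexed negative reduced cost (anti-cycling)."""
    for j, r in enumerate(reduced_costs):
        if r < 0:
            return j
    return None
```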
In 1972, Klee and Minty[2] gave an example of a linear programming problem in which the polytope P is a distortion of an n-dimensional cube. They showed that the simplex method as formulated by Dantzig visits all \(2^n\) vertices before arriving at the optimal vertex, so the worst-case complexity of the algorithm is exponential. Since then it has been shown that, for almost every deterministic pivot rule, there is a family of linear programs on which the rule performs badly. It is an open question whether there is a pivot rule with polynomial, or even sub-exponential, worst-case complexity.
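One common way to write the Klee-Minty construction is as a perturbed unit cube; the generator below is a sketch of that variant (the exact coefficients vary between presentations, and this is not Klee and Minty's original formulation):

```python
import numpy as np

def klee_minty(n, eps=0.3):
    """Perturbed-cube variant: maximize x_n subject to 0 <= x_1 <= 1 and
    eps*x_{j-1} <= x_j <= 1 - eps*x_{j-1}, with 0 < eps < 1/2.
    Returns (c, A, b) for max c^T x s.t. A x <= b, x >= 0."""
    A = np.zeros((2 * n, n))
    b = np.zeros(2 * n)
    for j in range(n):
        A[2 * j, j] = 1.0                 # upper bound: x_j (+ eps*x_{j-1}) <= 1
        b[2 * j] = 1.0
        A[2 * j + 1, j] = -1.0            # lower bound: eps*x_{j-1} - x_j <= 0
        if j > 0:
            A[2 * j, j - 1] = eps
            A[2 * j + 1, j - 1] = eps
    c = np.zeros(n)
    c[-1] = 1.0                           # objective: the last coordinate
    return c, A, b
```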
Nevertheless, the simplex method is remarkably efficient in practice. It has been known since the 1970s that it has polynomial-time average-case complexity under various distributions. These results on "random" matrices still didn't quite capture the desired intuition that the method works well on "typical" matrices. In 2001 Spielman and Teng introduced the notion of smoothed complexity to provide a more realistic analysis of the performance of algorithms.[3]
Other algorithms for solving linear programming problems are described in the linear programming article.
The simplex algorithm requires the linear programming problem to be in augmented form, in which the inequalities are replaced by equalities. The problem can then be written as follows in matrix form:

\[ \begin{bmatrix} 1 & -c^T & 0 \\ 0 & A & I \end{bmatrix} \begin{bmatrix} Z \\ x \\ x_s \end{bmatrix} = \begin{bmatrix} 0 \\ b \end{bmatrix}, \qquad x, x_s \ge 0, \]
where \(x_s\) are the slack variables introduced by the augmentation process, i.e. the non-negative distances between the point and the hyperplanes representing the linear constraints \(Ax \le b\).
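In code, the augmentation is nothing more than appending an identity block to \(A\); a small numpy sketch (with made-up numbers):

```python
import numpy as np

# A x <= b, x >= 0 becomes [A | I](x, x_s) = b with (x, x_s) >= 0.
A = np.array([[3.0, 2.0, 1.0],
              [2.0, 5.0, 3.0]])
b = np.array([10.0, 15.0])

A_aug = np.hstack([A, np.eye(len(b))])   # [A | I], a p-by-(n+p) matrix

x = np.array([0.0, 0.0, 0.0])            # any candidate point, here the origin
x_s = b - A @ x                          # the slacks measure the remaining room
assert np.allclose(A_aug @ np.concatenate([x, x_s]), b)
```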
By definition, the vertices of the feasible region are each at the intersection of n hyperplanes (either from \(Ax \le b\) or from the non-negativity constraints \(x \ge 0\)). This corresponds to n zeros among the n+p variables of \((x, x_s)\). Thus two feasible vertices are adjacent when they share n-1 zeros in the variables of \((x, x_s)\). This is the advantage of the augmented form notation: vertices, and jumps along edges of the polytope, are easy to write down.
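Equivalently, a vertex corresponds to a choice of p basic columns of \([A\ I]\): fix the other n variables at zero and solve for the basic ones. A sketch of that correspondence (the data reuse the augmentation example above):

```python
import numpy as np

def basic_solution(A_aug, b, basis):
    """Solve B x_B = b for the chosen basic columns; the n remaining
    (non-basic) variables stay at zero. The vertex is feasible iff x_B >= 0."""
    x = np.zeros(A_aug.shape[1])
    x[list(basis)] = np.linalg.solve(A_aug[:, list(basis)], b)
    return x

A_aug = np.array([[3.0, 2.0, 1.0, 1.0, 0.0],
                  [2.0, 5.0, 3.0, 0.0, 1.0]])
b = np.array([10.0, 15.0])
print(basic_solution(A_aug, b, basis=(3, 4)))  # slacks basic: the origin
print(basic_solution(A_aug, b, basis=(2, 3)))  # z and s basic: z = 5, s = 5
```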
The simplex algorithm goes as follows. Consider, for example, the problem

\[ \text{maximize } Z = 2x + 3y + 4z \quad \text{subject to} \quad 3x + 2y + z \le 10, \quad 2x + 5y + 3z \le 15, \quad x, y, z \ge 0. \]

Rewrite \(Z\) and the constraints in augmented form, introducing the slack variables \(s\) and \(t\):

\[ Z - 2x - 3y - 4z = 0, \qquad 3x + 2y + z + s = 10, \qquad 2x + 5y + 3z + t = 15. \]
\(x = y = z = 0\) is clearly a feasible vertex, so we start off with it: the three current non-basic variables are \(x\), \(y\) and \(z\). Luckily, \(Z\) is already expressed as an affine function of \(x, y, z\), so no row operations need to be done at this step. Here the most negative coefficient in the row of \(Z\) is \(-4\), so the direction with the highest gradient is \(z\). We then need to compute the coordinates of the vertex we jump to by increasing \(z\); that will set either \(s\) or \(t\) to 0, and the correct one must be found. For \(s\), the bound on \(z\) equals \(10/1 = 10\); for \(t\) it is \(15/3 = 5\). \(t\) is the correct one because it gives the smaller ratio (increasing \(z\) beyond 5 would make \(t\) negative). The new vertex is thus \(x = y = t = 0\). Rewriting \(Z\) as an affine function of these new non-basic variables gives

\[ Z = 20 - \tfrac{2}{3}x - \tfrac{11}{3}y - \tfrac{4}{3}t, \qquad \text{i.e.} \qquad Z + \tfrac{2}{3}x + \tfrac{11}{3}y + \tfrac{4}{3}t = 20. \]
Now all the coefficients on the first row (the row of \(Z\)) have become non-negative. That means \(Z\) cannot be improved by increasing any of the current non-basic variables. The algorithm terminates and the solution is the vertex \(x = y = t = 0\). At that vertex \(Z = 20\), and this is its maximum value. Usually the coordinates of the solution vertex are needed in the standard variables \(x, y, z\); a little substitution yields \(x = 0\), \(y = 0\) and \(z = 5\).
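The steps above are mechanical enough to implement directly. Below is a compact sketch of the tableau method, assuming Dantzig's pivot rule and a problem already in the form max \(c^T x\), \(Ax \le b\), \(x \ge 0\) with \(b \ge 0\) (so that the all-slack basis is a feasible starting vertex); it reproduces the worked example:

```python
import numpy as np

def simplex(c, A, b):
    """Tableau simplex for max c^T x s.t. A x <= b, x >= 0, assuming b >= 0.
    Returns (x, optimum). Illustrative only: no degeneracy safeguards."""
    p, n = A.shape
    T = np.hstack([A, np.eye(p), b.reshape(-1, 1)]).astype(float)  # [A | I | b]
    z = np.concatenate([-c, np.zeros(p + 1)])     # objective row: Z - c^T x = 0
    basis = list(range(n, n + p))                 # start with the slacks basic

    while True:
        j = int(np.argmin(z[:-1]))                # entering: most negative cost
        if z[j] >= 0:
            break                                 # all coefficients nonnegative
        mask = T[:, j] > 1e-12
        ratios = np.full(p, np.inf)
        ratios[mask] = T[mask, -1] / T[mask, j]
        i = int(np.argmin(ratios))                # leaving: minimum ratio test
        if np.isinf(ratios[i]):
            raise ValueError("problem is unbounded")
        T[i] /= T[i, j]                           # pivot: normalize row i ...
        for k in range(p):
            if k != i:
                T[k] -= T[k, j] * T[i]            # ... eliminate column j ...
        z -= z[j] * T[i]                          # ... including in the Z row
        basis[i] = j

    x = np.zeros(n + p)
    x[basis] = T[:, -1]
    return x[:n], float(z[-1])

x, opt = simplex(np.array([2.0, 3.0, 4.0]),
                 np.array([[3.0, 2.0, 1.0], [2.0, 5.0, 3.0]]),
                 np.array([10.0, 15.0]))
print(x, opt)                                     # (0, 0, 5) and Z = 20
```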
The tableau form used above to describe the algorithm lends itself to an immediate implementation in which the tableau is maintained as a rectangular \((p+1)\)-by-\((n+p+1)\) array. It is straightforward to avoid storing the \(p\) explicit columns of the identity matrix that occur within the tableau, since the basis matrix \(B\) (the columns of \([A\ I]\) corresponding to the current basic variables) is a subset of the columns of \([A\ I]\). This implementation is referred to as the standard simplex method. The storage and computation overhead are such that the standard simplex method is a prohibitively expensive approach to solving large linear programming problems.
In each simplex iteration, the only data required are the first row of the tableau, the (pivotal) column of the tableau corresponding to the entering variable, and the right-hand side. The latter can be updated using the pivotal column, and the first row of the tableau can be updated using the (pivotal) row corresponding to the leaving variable. Both the pivotal column and the pivotal row may be computed directly by solving linear systems of equations involving the matrix \(B\), together with matrix-vector products using \(A\). These observations motivate the revised simplex method, for which implementations are distinguished by their invertible representation of \(B\).
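A sketch of the linear algebra inside one revised simplex iteration, with a dense LU factorization standing in for the invertible representation of \(B\) (production codes maintain and update a factorization rather than recomputing it, and handle degeneracy; none of that is shown here):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def revised_step(A_aug, c_aug, b, basis):
    """One pricing/ratio-test step for max c^T x over [A|I](x,xs)=b, x,xs>=0.
    Returns (entering, leaving) indices, or None at optimality."""
    lu = lu_factor(A_aug[:, basis])            # the invertible representation
    x_B = lu_solve(lu, b)                      # current basic solution
    y = lu_solve(lu, c_aug[basis], trans=1)    # duals: solve B^T y = c_B
    r = c_aug - A_aug.T @ y                    # reduced costs (pricing)
    r[basis] = 0.0
    j = int(np.argmax(r))                      # Dantzig rule, maximization form
    if r[j] <= 1e-12:
        return None                            # no improving column: optimal
    d = lu_solve(lu, A_aug[:, j])              # pivotal column: solve B d = A_j
    pos = d > 1e-12
    if not pos.any():
        raise ValueError("problem is unbounded")
    steps = np.full(len(d), np.inf)
    steps[pos] = x_B[pos] / d[pos]
    i = int(np.argmin(steps))                  # minimum ratio test
    return j, basis[i]
```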
In large linear programming problems A is typically a sparse matrix and, when the resulting sparsity of B is exploited in maintaining its invertible representation, the revised simplex method is a vastly more efficient solution procedure than the standard simplex method. Commercial simplex solvers are based on the primal (or dual) revised simplex method.